Produce forecasts for the following series using whichever of
NAIVE(y), SNAIVE(y) or RW(y ~ drift()) is more appropriate in each case:
- Australian Population (global_economy)
- Bricks (aus_production)
- NSW Lambs (aus_livestock)
- Household wealth (hh_budget)
- Australian takeaway food turnover (aus_retail)
Use the Facebook stock price (data set gafa_stock) to do the following:
- Produce a time plot of the series.
- Produce forecasts using the drift method and plot them.
- Show that the forecasts are identical to extending the line drawn between the first and last observations.
- Try using some of the other benchmark functions to forecast the same data set. Which do you think is best? Why?
We have a problem here: some dates are missing, so the tsibble is irregular (that is why the x-axis label shows Date [!]).
I am going to fix this with linear interpolation.
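As a reminder of what linear interpolation does here, a minimal base-R example with `approx()` (toy vectors, independent of the stock data):

```r
# Minimal base-R illustration of linear interpolation with approx():
x <- c(1, 2, 3, 4, 5)
y <- c(10, NA, 30, NA, 50)           # missing values to fill
filled <- approx(x, y, xout = x)$y   # NA pairs are dropped, then interpolated
filled
# -> 10 20 30 40 50
```

Each NA is replaced by the value on the straight line between its known neighbours, which is exactly what we want for the missing trading days.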
fb_interpolation <- fb_raw %>%
  update_tsibble(index = Date, regular = TRUE) %>%   # declare a regular daily index
  fill_gaps() %>%                                    # insert the missing dates as NA
  select(Close) %>%
  mutate(Close = approx(Date, Close, xout = Date)$y) # linear interpolation over the gaps
autoplot(fb_interpolation, Close)
An alternative would have been to aggregate to weekly data, e.g. taking the closing price at the start of each week.
fb_interpolation_fit <- fb_interpolation %>%
model(RW(Close ~ drift()))
fb_interpolation_fit %>%
forecast(h = 100) %>%
autoplot(fb_interpolation) +
labs(title = "Symbol = FB",
subtitle = "Forecast with DRIFT method") +
scale_x_date(date_breaks = "1 year",
date_labels = "%Y.")
Show that the forecasts are identical to extending the line drawn between the first and last observations.
manual_drift_forecast <- function(h) {
history_length <- length(fb_interpolation$Close)
value_history_start <- fb_interpolation$Close[1]
value_history_end <- fb_interpolation$Close[history_length]
h_term <- (value_history_end - value_history_start) / (history_length - 1)
return(value_history_end + h * h_term)
}
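On a toy series, this formula is easily seen to be the same as extending the straight line through the first and last observations (a base-R check, independent of the fpp3 objects above):

```r
# Toy base-R check: the drift forecast  y_T + h * (y_T - y_1) / (T - 1)
# equals the line through (1, y_1) and (T, y_T) evaluated at T + h.
y <- c(2, 5, 3, 9)                            # toy series
T_len <- length(y)
slope <- (y[T_len] - y[1]) / (T_len - 1)      # slope of the first-to-last line
h <- 1:3
drift_fc <- y[T_len] + h * slope              # drift-method forecasts
line_fc  <- y[1] + slope * ((T_len + h) - 1)  # straight-line extrapolation
all.equal(drift_fc, line_fc)                  # TRUE
```

Algebraically, y_1 + slope * (T - 1) = y_T, so the two expressions are identical; any difference in practice is floating-point rounding.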
final_check <- fb_interpolation_fit %>%
forecast(h = 100) %>%
as_tibble() %>%
select(Date, .mean) %>%
purrr::set_names(c("Date", "DRIFT_MODEL")) %>%
mutate(h = seq_along(Date),
DRIFT_MANUAL = sapply(h, manual_drift_forecast),
DIFFERENCE = DRIFT_MANUAL - DRIFT_MODEL)
final_check %>%
filter(DIFFERENCE != 0)
## # A tibble: 17 × 5
## Date DRIFT_MODEL h DRIFT_MANUAL DIFFERENCE
## <date> <dbl> <int> <dbl> <dbl>
## 1 2019-01-13 132. 13 132. -2.84e-14
## 2 2019-01-29 132. 29 132. -2.84e-14
## 3 2019-02-03 133. 34 133. -2.84e-14
## 4 2019-02-08 133. 39 133. -2.84e-14
## 5 2019-02-19 133. 50 133. -2.84e-14
## 6 2019-02-24 133. 55 133. -2.84e-14
## 7 2019-03-01 134. 60 134. -2.84e-14
## 8 2019-03-06 134. 65 134. -2.84e-14
## 9 2019-03-12 134. 71 134. -2.84e-14
## 10 2019-03-17 134. 76 134. -2.84e-14
## 11 2019-03-22 134. 81 134. -2.84e-14
## 12 2019-03-27 135. 86 135. -2.84e-14
## 13 2019-03-28 135. 87 135. -2.84e-14
## 14 2019-04-01 135. 91 135. -2.84e-14
## 15 2019-04-02 135. 92 135. -2.84e-14
## 16 2019-04-06 135. 96 135. -2.84e-14
## 17 2019-04-07 135. 97 135. -2.84e-14
These are just floating-point rounding differences (on the order of 1e-14); the two sets of forecasts are effectively identical.
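Base R shows the same behaviour with simple decimals, since doubles cannot represent most decimal fractions exactly:

```r
# Doubles cannot represent most decimals exactly, so tiny differences appear:
(0.1 + 0.2) - 0.3             # not exactly zero
all.equal(0.1 + 0.2, 0.3)     # TRUE: equal up to numerical tolerance
```

This is why comparisons like `DIFFERENCE != 0` should generally use a tolerance (e.g. `all.equal()` or `abs(x) < 1e-8`) rather than exact equality.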
Try using some of the other benchmark functions to forecast the same data set. Which do you think is best? Why?
I still think the drift method is best, since there is no seasonality in this series.
Apply a seasonal naïve method to the quarterly Australian beer production data from 1992. Check if the residuals look like white noise, and plot the forecasts. The following code will help.
# Extract data of interest
recent_production <- aus_production %>%
filter(year(Quarter) >= 1992)
# Define and estimate a model
fit <- recent_production %>% model(SNAIVE(Beer))
# Look at the residuals
fit %>% gg_tsresiduals()
A couple of points should be made: there is a significant ACF spike at lag = 4, but we can ignore it, and the residuals do not look normally distributed. We can confirm the non-normality with a Shapiro–Wilk test:
augment(fit) %>%
  pull(.resid) %>%
  shapiro.test()
##
## Shapiro-Wilk normality test
##
## data: .
## W = 0.86827, p-value = 2.667e-06
Nevertheless, forecasts will probably be quite good, but prediction intervals that are computed assuming a normal distribution may be inaccurate.
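For reference, a toy base-R example of how the Shapiro–Wilk p-value behaves on normal versus skewed samples:

```r
# How to read shapiro.test(): a small p-value means normality is rejected.
set.seed(1)
normal_sample <- rnorm(100)          # draws from a normal distribution
skewed_sample <- rexp(100)           # clearly non-normal (right-skewed)
shapiro.test(normal_sample)$p.value  # large: no evidence against normality
shapiro.test(skewed_sample)$p.value  # tiny: normality rejected
```

The p-value of 2.667e-06 above is therefore strong evidence that the residuals are not normally distributed.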
Repeat the previous exercise using the Australian Exports series from global_economy and the Bricks series from aus_production. Use whichever of NAIVE() or SNAIVE() is more appropriate in each case.
Exports (global_economy): We cannot use SNAIVE() here since the data are annual (there are no seasons to repeat). So the only reasonable method is NAIVE(), though I would prefer RW(y ~ drift()) here.
Now let’s see Paul Allen’s forecast method.
Bricks (aus_production): IMO, NAIVE() is better here, since SNAIVE() gives residuals with non-constant variance, a right-skewed distribution, etc.
Produce forecasts for the 7 Victorian series in aus_livestock using SNAIVE(). Plot the resulting forecasts including the historical data. Is this a reasonable benchmark for these series?
Yes, these are reasonable benchmarks.
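Conceptually, the seasonal naive benchmark just repeats the most recent seasonal cycle; the arithmetic in base-R terms (toy quarterly data, not the livestock series):

```r
# Seasonal naive by hand: the forecast is the observation from the
# same season in the last observed cycle.
y <- c(10, 20, 30, 40,  12, 22, 32, 42)  # two "years" of quarterly data
m <- 4                                   # seasonal period
h <- 1:6                                 # forecast horizons
# y[T + h - m * ceiling(h / m)] picks the matching season from the last cycle
snaive_fc <- y[length(y) + h - m * ceiling(h / m)]
snaive_fc
# -> 12 22 32 42 12 22
```

This is why SNAIVE() is a sensible benchmark for strongly seasonal series such as monthly slaughter counts.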
An argument could be made for other benchmark methods in some of the series, though.
Are the following statements true or false? Explain your answer.
For your retail time series (from Exercise 8 in Section 2.10):
Create a training dataset consisting of observations before 2011 using:
set.seed(12345678)
myseries <- aus_retail %>%
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
myseries_train <- myseries %>%
filter(year(Month) < 2011)
Check that your data have been split appropriately by producing the following plot.
autoplot(myseries, Turnover) +
autolayer(myseries_train, Turnover, colour = "red")
Fit a seasonal naïve model using SNAIVE() applied to your training data (myseries_train).
fit <- myseries_train %>%
model(SNAIVE(Turnover))
Check the residuals.
fit %>% gg_tsresiduals()
No, the residuals do not appear to be uncorrelated and normally distributed.
Produce forecasts for the test data
fc <- fit %>%
forecast(new_data = anti_join(myseries, myseries_train))
fc %>% autoplot(myseries)
Compare the accuracy of your forecasts against the actual values.
accuracy(fc, myseries)
## # A tibble: 1 × 12
## .model State Indus…¹ .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 SNAIVE(Tu… Nort… Clothi… Test 0.836 1.55 1.24 5.94 9.06 1.36 1.28 0.601
## # … with abbreviated variable name ¹Industry
The forecasts on the test data seem reasonably accurate (MAPE of about 9%).
How sensitive are the accuracy measures to the amount of training data used?
As the graph shows, the accuracy measures are quite sensitive to the amount of training data: there is an optimal training-set size (72 observations here) that minimises the errors.
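The idea can be sketched in base R on synthetic data (using the MEAN benchmark on a random walk; this is an illustration of the procedure, not the retail series itself):

```r
# Sketch: test-set RMSE of the MEAN benchmark as the training window grows
set.seed(42)
y <- cumsum(rnorm(120))                  # synthetic random walk
test <- y[101:120]                       # fixed 20-observation test set
rmse_for <- function(n_train) {
  train <- y[(101 - n_train):100]        # last n_train points before the test set
  fc <- rep(mean(train), length(test))   # MEAN benchmark forecast
  sqrt(mean((test - fc)^2))
}
sizes <- seq(20, 100, by = 10)
rmse <- sapply(sizes, rmse_for)
sizes[which.min(rmse)]                   # training size giving the lowest test RMSE
```

Repeating this for each candidate training size and plotting `rmse` against `sizes` gives the kind of curve referred to above.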
Consider the number of pigs slaughtered in New South Wales (data set aus_livestock).
- Produce some plots of the data in order to become familiar with it.
- Create a training set of 486 observations, withholding a test set of 72 observations (6 years).
- Try using various benchmark methods to forecast the training set and compare the results on the test set. Which method did best?
- Check the residuals of your preferred method. Do they resemble white noise?
Let’s check the actual time series:
Seasonality is present, and the trend follows a random walk.
Combined with this:
aus_pigs %>%
features(Count, feat_stl) %>%
transpose()
## [[1]]
## [[1]]$trend_strength
## [1] 0.950971
##
## [[1]]$seasonal_strength_year
## [1] 0.6424557
##
## [[1]]$seasonal_peak_year
## [1] 6
##
## [[1]]$seasonal_trough_year
## [1] 7
##
## [[1]]$spikiness
## [1] 9148187789
##
## [[1]]$linearity
## [1] -49899.04
##
## [[1]]$curvature
## [1] -445937.5
##
## [[1]]$stl_e_acf1
## [1] -0.1929197
##
## [[1]]$stl_e_acf10
## [1] 0.4107981
… we can say that:
- trend_strength: the trend component is strong (i.e. the trend accounts for most of the variation in the seasonally adjusted series, relative to the remainder).
- seasonal_strength_year: the seasonal component is also strong, but not as strong as the trend.
- linearity & curvature: the trend is negative (declining, with downward curvature).
Create a training set of 486 observations, withholding a test set of 72 observations (6 years).
# time series split is from "preamble.R" -> custom function.
tt_split <- time_series_split(aus_pigs, 486)
tt_split
## $train
## # A tsibble: 486 x 2 [1M]
## Month Count
## <mth> <dbl>
## 1 1972 Jul 97400
## 2 1972 Aug 114700
## 3 1972 Sep 109900
## 4 1972 Oct 108300
## 5 1972 Nov 122200
## 6 1972 Dec 106900
## 7 1973 Jan 96600
## 8 1973 Feb 96700
## 9 1973 Mar 121200
## 10 1973 Apr 99300
## # … with 476 more rows
## # ℹ Use `print(n = ...)` to see more rows
##
## $test
## # A tsibble: 72 x 2 [1M]
## Month Count
## <mth> <dbl>
## 1 2013 Jan 71300
## 2 2013 Feb 69700
## 3 2013 Mar 79900
## 4 2013 Apr 74300
## 5 2013 May 87200
## 6 2013 Jun 71200
## 7 2013 Jul 86200
## 8 2013 Aug 78000
## 9 2013 Sep 71200
## 10 2013 Oct 78200
## # … with 62 more rows
## # ℹ Use `print(n = ...)` to see more rows
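The helper is defined in the author's preamble.R, which is not shown; a plausible base-R sketch of what it does (hypothetical reconstruction, the actual version may differ):

```r
# Hypothetical reconstruction of time_series_split(): the first train_size
# rows become the training set, the remaining rows the test set.
time_series_split <- function(dataset, train_size) {
  list(
    train = head(dataset, train_size),
    test  = tail(dataset, nrow(dataset) - train_size)
  )
}
```

Because tsibbles are data frames, `head()` and `tail()` preserve the index, so both elements remain valid tsibbles.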
Try using various benchmark methods to forecast the training set and compare the results on the test set. Which method did best?
accuracy(fc, tt_split$test) %>%
arrange(RMSE)
## # A tibble: 4 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 F_DRIFT Test -4685. 8091. 6967. -7.36 10.1 NaN NaN 0.0785
## 2 F_NAIVE Test -6138. 8941. 7840. -9.39 11.4 NaN NaN 0.0545
## 3 F_SNAIVE Test -9204. 11802. 10312. -13.7 15.0 NaN NaN -0.0629
## 4 F_MEAN Test -39360. 39894. 39360. -55.9 55.9 NaN NaN 0.0545
The drift method did best. (MASE and RMSSE are NaN because accuracy() was given only the test set, so there are no training data from which to compute the scaling factor.)
Check the residuals of your preferred method. Do they resemble white noise?
fit %>%
select(F_DRIFT) %>%
gg_tsresiduals()
The innovation residuals do resemble white noise, but the ACF and the residual distribution suggest that the computed prediction intervals will most likely not be accurate.
Create a training set for household wealth (hh_budget) by withholding the last four years as a test set.
tt_split <- time_series_split(
dataset = aus_wealth,
train_size = nrow(aus_wealth) - 4
)
Fit all the appropriate benchmark methods to the training set and forecast the periods covered by the test set.
fit <- tt_split$train %>%
model(
F_MEAN = MEAN(Wealth),
F_NAIVE = NAIVE(Wealth),
F_SNAIVE = SNAIVE(Wealth ~ lag(2)),
F_DRIFT = RW(Wealth ~ drift())
)
Compute the accuracy of your forecasts. Which method does best?
accuracy(fc, tt_split$test) %>%
arrange(RMSE)
## # A tibble: 4 × 11
## .model Country .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 F_DRIFT Australia Test 29.1 35.5 29.1 7.23 7.23 NaN NaN 0.210
## 2 F_NAIVE Australia Test 34.7 41.5 34.7 8.64 8.64 NaN NaN 0.216
## 3 F_MEAN Australia Test 35.7 42.3 35.7 8.89 8.89 NaN NaN 0.216
## 4 F_SNAIVE Australia Test 52.3 56.4 52.3 13.3 13.3 NaN NaN -0.0138
Do the residuals from the best method resemble white noise?
Not quite, but the good thing is that the errors are uncorrelated.
- Create a training set for Australian takeaway food turnover (aus_retail) by withholding the last four years as a test set.
- Fit all the appropriate benchmark methods to the training set and forecast the periods covered by the test set.
- Compute the accuracy of your forecasts. Which method does best?
- Do the residuals from the best method resemble white noise?
## # A tibble: 4 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 F_DRIFT Test 19.3 66.5 45.0 1.02 2.63 NaN NaN -0.200
## 2 F_SNAIVE Test 34.2 70.7 39.8 1.94 2.29 NaN NaN -0.116
## 3 F_NAIVE Test 27.4 71.9 46.6 1.52 2.71 NaN NaN -0.173
## 4 F_MEAN Test 910. 913. 910. 55.3 55.3 NaN NaN -0.173
No, the residuals do not resemble white noise.
We will use the Bricks data from aus_production (Australian quarterly clay brick production, 1956–2005) for this exercise.
- Use an STL decomposition to calculate the trend-cycle and seasonal indices. (Experiment with having fixed or changing seasonality.)
- Compute and plot the seasonally adjusted data.
- Use a naïve method to produce forecasts of the seasonally adjusted data.
- Use decomposition_model() to reseasonalise the results, giving forecasts for the original data.
- Do the residuals look uncorrelated?
- Repeat with a robust STL decomposition. Does it make much difference?
- Compare forecasts from decomposition_model() with those from SNAIVE(), using a test set comprising the last 2 years of data. Which is better?